ABSTRACT


The  "seismic  equalization"  problem  is  that  of  correcting  the 
response  at  one  station  to  match  that  at  another  station  which  may 
have  different  instrument  characteristics  and  different  (and  unknown) 
local  reverberation  characteristics.  In  this  note,  the  problem  of 
seismic  equalization  is  formulated  mathematically,  and  that  portion 
involving measurement or estimation of a transfer-function ratio is
modeled and attacked on statistical terms, first by an ad hoc procedure
and then by the method of maximum likelihood.
I.  INTRODUCTION*


The  statistical  problem  examined  in  this  note  is  motivated  by  the  need  for 
"sensor  equalization"  in  seismic  array  processing.  To  define  "sensor  equalization" 
we  think  of  a  sensor  as  a  composite  electromechanical  transducer  comprising  (i)  the 
local  geology  on  which  a  seismic  wave  impinges,  (ii)  the  coupling  of  this  geology  or 
terrain  to  the  seismometer,  and  (iii)  the  seismometer  itself.  The  "sensor  response" 

(as  the  term  will  be  used  here)  is  the  combined  response  of  all  these  elements,  that  is, 
the  impulse  response  or  (frequency)  transfer  function  that  relates  the  signature  of  the 
wave  to  the  seismometer  electrical  output  of  the  sensor.  Considering  the  nature  of 
the  elements  (i)  and  (ii),  it  is  only  realistic  to  view  the  sensor  responses  as  unknown, 
at  least  to  some  degree,  and  also  to  be  a  function  of  wave  arrival  angle. 

The  measurement  of  each  sensor  response  is  at  best  difficult,  for  we  have  no 
control  over  the  waves  that  must  be  relied  on  to  "probe,"  nor  can  we  know  their 
signatures  in  detail.  The  actual  measurement  of  the  complete  sensor  response  is  the 
basic  "deconvolution"  problem,  which  we  will  not  discuss. 

If,  however,  we  address  the  more  modest  "equalization"  goal  of  obtaining 
identical  sensor  outputs  (in  the  absence  of  noise)  to  a  wave  arriving  at  a  known  angle 
without explicitly finding the sensor responses, then there is reason to be more hopeful.
Of course, even should identical outputs be obtainable, they will still suffer some
distortion of the wave signature, but this should not unduly disturb trained
seismologists already familiar with such sensor aberrations.

Thus, in the equalization problem, we seek to determine the ratio between the
complex-valued transfer functions $H_1(\omega)$ and $H_2(\omega)$ of a pair of sensors (this is equivalent
to solving a linked pair of integral equations in the impulse responses, a conceptually
more difficult task), and then to construct a filter having the transfer ratio $H_1(\omega)/H_2(\omega)$
[or $H_2(\omega)/H_1(\omega)$] as its transfer function. Such a filter (which, if it happens not to be
physically realizable, requires only that some delay be inserted in the system), when
placed in tandem with the second sensor, will convert its response into that of the first
(or vice versa). In this note we confine attention to the measurement of $H_1(\omega)/H_2(\omega)$
rather than its realization, and deal with just a pair of sensors. Should there be more
than two, we would equalize pair-by-pair, or perhaps equalize each with respect to
the sum (previous to equalization) of all.

* Seismic equalization has also been under study by Texas Instruments. In addition,
a paper by the late Dr. M. J. Levin pertains to this problem.

To perform the transfer-ratio measurement for a given arrival angle, we must
rely  on  a  collection  of  responses  to  probing  waves  (geophysically  or  atomically  generated) 
that  are  of  largely  unknown  signature  but  that  are  known  to  have  arrived  at  that  angle. 
Provided  that  for  each  event  the  same  wave  signature  is  received  at  each  sensor  input 
(apart  from  a  fixed  difference  in  relative  amplitude)  and  that  enough  probing  energy  is 
accumulated at each frequency $\omega$, relative to the system noise and to the number of
responses  in  the  collection,  the  measurement  can  be  accomplished  with  vanishingly 
small  rms  error.  If,  however,  the  first  condition  is  violated,  as  can  happen  for 
example  if  the  source  fault  plane  or  radiation  pattern  (at  a  given  frequency)  varies 
from  event  to  event  and  the  seismometers  are  far  apart,  then  equalization  is  probably 
unattainable.  By  means  of  the  statistical  analysis  contained  in  this  note  it  should  be 
possible,  through  the  setting  up  of  confidence  regions  on  the  estimates  of  the  transfer 
ratio  from  separate  collections  of  responses,  to  test  for  such  a  contingency. 

To  enjoy  the  convenience  of  working  in  the  frequency  domain,  we  use  as  the 
collection  of  sensor  outputs  the  Fourier  transforms  of  a  selected  set  of  seismometer 
output waveform sections. The question arises of how long the output sections or
observations should be: too short an observation results in smearing of spectral detail,
while one that is too long contains an inordinate amount of noise. No guide on this
point  is  presently  available,  other  than  intuition. 

To  be  realistic,  we  must  presume  in  general  that  there  is  correlation  between 
the  noises  in  the  sensor  outputs,  and  even  that  the  correlation  may  vary  from  the  arrival 
time  of  one  wave  to  that  of  the  next  (sorting  as  to  arrival  angle  may  result  in  such 
infrequent  probing  that  the  noise  field  can  change  considerably).  If  not  known,  such 
correlation can so bias the measurement of $H_1/H_2$ as to render it void. Fortunately,
however, it is wholly reasonable to assume that the correlations, as well as the noise
intensities at the two seismometer outputs, are known at all times. This of course
requires additional data-processing beyond that implied by the development presented
in the following Sections, but hardly more than is already employed at present in
sophisticated  array  work. 

We  now  proceed  with  a  statistical  analysis  of  the  problem  of  measuring  the 
ratio  between  a  pair  of  transfer  functions  when  noise  disturbances  are  present. 
II.  THE  STATISTICAL  MODEL


We adopt the following model for the single-frequency equalization problem.
(Equalization is to be accomplished frequency-by-frequency.) From data on N seismic
events, we are given N pairs of Fourier-spectral observations $\{Y_{1i}, Y_{2i}\}$, known to
be generated by

$$Y_{1i} = H_1 X_i + N_{1i}, \qquad Y_{2i} = H_2 X_i + N_{2i}; \qquad i = 1, \ldots, N \tag{1}$$

where the frequency parameter $\omega$ is henceforth left implicit. Here, the sensor
transfer functions $H_1, H_2$ and the event excitations (Fourier transforms of the wave signatures)
$\{X_i\}$ are unknown complex constants (or mathematical, not random, variables) and the
noise disturbances $\{N_{1i}\}, \{N_{2i}\}$ are zero-mean, complex gaussian variates, taken to
be independent between observation pairs:

$$\overline{N_{ki} N_{lj}} = 0 = \overline{N_{ki}^* N_{lj}} \quad \text{when } i \ne j; \text{ for } k = 1,2 \text{ and } l = 1,2 \tag{2}$$

Our task is to form an estimate of the transfer-function ratio $H_1/H_2$ and to
draw confidence regions about it, so that an effective signal equalizer may be
constructed with the aim of converting the mean of any $Y_{2i}$ into that of the corresponding
$Y_{1i}$. In forming the estimate, we are allowed knowledge only of the $\{Y_{1i}, Y_{2i}\}$ and
the noise statistics; in addition to the reasonable assumption (2), it is convenient
to presuppose the observations to have been normalized (through individual weighting
by positive real constants derived from the known noise intensities) so that

$$\overline{|N_{ki}|^2} = 2\,\overline{[\mathrm{Re}(N_{ki})]^2} = 2\,\overline{[\mathrm{Im}(N_{ki})]^2} = 1; \quad \text{all } i, k \tag{3}$$

Equation  (3)  implies  statistical  identity  and  independence  between  the  real  and 
imaginary  components  of  each  noise  variate,  conditions  that  in  practice  will  as  a  rule 
be met quite closely. The noise normalization both simplifies the analysis and has a
valid basis in the well-accepted notion that one should play down observations known to
have  excessive  noise  and  accent  those  for  which  the  random  disturbances  are  small. 
Furthermore,  normalization  is  a  reversible  procedure,  and  no  statistical  "information" 
is  lost  in  such  an  operation  (in  fact,  the  method  of  maximum  likelihood  can  be  shown 
to  dictate  such  normalization). 

In order to retain the basic character of our model (1) when normalization is
applied, however, we require the further, and perhaps on occasion unrealistic, assumption

$$\sqrt{\overline{|N_{1i}|^2}} \Big/ \sqrt{\overline{|N_{2i}|^2}} = \gamma > 0, \quad \text{a constant independent of } i \tag{4}$$

(the noise intensities in (4) being those that hold before normalization). If $\gamma = 1$,
the nature of the $\{X_i\}$ as unknowns leaves (1) quite unaltered after
normalization; otherwise, the estimate of $H_1/H_2$ in the original, unnormalized situation
is given by estimating $H_1/H_2$ from the normalized data and then multiplying by $\gamma$.
[Should (4) be violated, our results would no longer apply, but an appropriate
generalization of the analysis no doubt exists.]
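The normalization step and the role of $\gamma$ can be illustrated with a minimal noise-free sketch. It assumes that $\gamma$ in (4) is the ratio of the rms noise level at sensor 1 to that at sensor 2; the function names and numerical values are hypothetical and chosen only for illustration.

```python
def normalize_pair(y1, y2, sigma1, sigma2):
    """Weight each observation by 1/(rms noise) so both noises have unit intensity, as in (3)."""
    return y1 / sigma1, y2 / sigma2


def undo_normalization(ratio_normalized, gamma):
    """Map a ratio obtained from normalized data back to H1/H2 by multiplying by gamma."""
    return gamma * ratio_normalized


# Hypothetical noise-free illustration values (not taken from the report).
H1, H2 = 2 + 1j, 1 - 1j
X = 0.5 + 2j
sigma1, sigma2 = 3.0, 1.5          # assumed known rms noise levels
gamma = sigma1 / sigma2            # assumption (4): the same for every event

z1, z2 = normalize_pair(H1 * X, H2 * X, sigma1, sigma2)
recovered = undo_normalization(z1 / z2, gamma)   # should equal H1 / H2
```

In the normalized data the apparent ratio is $H_1/(\gamma H_2)$, which is why the final multiplication by $\gamma$ is needed.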

Finally,  we  permit  noise  correlation  to  exist  within  any  observation  pair.  The 
correlation  must  be  known  but  may  vary  from  pair  to  pair.  This  enables  most  seismic 
situations  to  be  modeled  realistically: 

$$\overline{N_{1i} N_{2i}^*} = \rho_i; \qquad |\rho_i| \le 1 \tag{5a}$$

We assume, however, that for all i

$$\overline{N_{1i} N_{2i}} = 0 \tag{5b}$$

That  no  complex  correlation  coefficient  can  exceed  unity  in  magnitude  may  be  shown 
through  the  Schwarz  inequality.  The  condition  (5b)  will  be  nearly  true  in  most 
measurement  situations. 
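As an illustration of the model (1) through (5b), the following is a minimal simulation sketch using only the Python standard library. The construction of the correlated noise pair (mixing two independent unit-intensity circular gaussian variates) is one convenient choice, not something prescribed by the text, and all function names are hypothetical.

```python
import math
import random


def unit_circular_gaussian(rng):
    """Zero-mean complex gaussian with E|N|^2 = 1 and independent, equal-variance parts (3)."""
    s = math.sqrt(0.5)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def correlated_noise_pair(rho, rng):
    """Return (N1, N2) with E[N1 N2*] = rho and E[N1 N2] = 0, as in (5a) and (5b)."""
    w1 = unit_circular_gaussian(rng)
    w2 = unit_circular_gaussian(rng)
    n2 = w2
    n1 = rho * w2 + math.sqrt(1.0 - abs(rho) ** 2) * w1
    return n1, n2


def simulate_observations(h1, h2, excitations, rhos, rng):
    """Generate the observation pairs of model (1), one pair per event."""
    y1, y2 = [], []
    for x, rho in zip(excitations, rhos):
        n1, n2 = correlated_noise_pair(rho, rng)
        y1.append(h1 * x + n1)
        y2.append(h2 * x + n2)
    return y1, y2
```

Because a fresh noise pair is drawn for each event, the noises are independent between observation pairs, as required by (2), while the correlation within a pair may differ from event to event.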
Except that complex quantities are involved, our problem is not new;
investigations of this type have been the subject of considerable post-war research
among statisticians. Chapter 29 of Kendall and Stuart furnishes an excellent
introduction to this class of problems, showing that regression analysis constitutes a
restricted  subclass,  and  noting  that  the  motivation  and  applications  usually  relate  to 
the  natural  sciences  and  econometrics.  (I  am  indebted  to  Dr.  Max  Halperin  for  this 
reference,  which  although  treating  a  scalar  rather  than  complex-valued  situation, 
parallels  and  confirms  my  independent  efforts  at  a  number  of  points. ) 

In Kendall and Stuart ("K/S") terms, our problem is one of estimation within
a "functional relation," where this relation is

$$H_1 X_i = (H_1/H_2)\, H_2 X_i \tag{6}$$

and $H_1/H_2$ is to be estimated when only randomly-perturbed versions of the $\{H_1 X_i\}$
and $\{H_2 X_i\}$ are observable, as in (1). In K/S, the scalar equivalents of the $\{H_1 X_i\}$
and $\{H_2 X_i\}$ are viewed either as non-random unknowns or as gaussian variates of
unknown statistics, but both approaches yield similar results; in the seismic context
it is probably more realistic to take the former view. Also in K/S, the statistics of
the gaussian disturbances $\{N_{1i}\}, \{N_{2i}\}$ are considered unknown (but not i-dependent),
and required to be estimated along with $H_1/H_2$ (again, however, we are interpreting
their scalar results in terms of our notation). We assume, however, that these statistics
are available (as is reasonable in the seismic context) even though the similarity
between our results and those of K/S implies that this additional information may not
actually be needed in formulating an estimator. (However, the confidence regions
would be expected to differ according to whether the noise statistics are known or
must be simultaneously estimated, and in fact we propose a different confidence
procedure from that of K/S, one that draws on our presumed additional knowledge.)
III.  AD  HOC  ESTIMATION


In first undertaking this study, we considered estimators of $H_1/H_2$ that were
quite  ad  hoc,  but  which  were  completely  explicit  and  fairly  amenable  to  the  statistical 
analysis  required  for  setting  confidence  limits.  We  now  present  these  early  results, 
deferring  to  Section  IV  the  generally  more  implicit,  but  often  asymptotically  more 
efficient,  estimation  procedure  prescribed  by  the  method  of  maximum  likelihood. 

Let us begin by momentarily imagining that the variances of the components of
the $\{N_{1i}\}, \{N_{2i}\}$ are zero, so that these variates vanish in (1), but that $H_1$, $H_2$
and the $\{X_i\}$ remain unknown. Then simply by forming the ratio $Y_{1i}/Y_{2i}$, for any i
for which $X_i \ne 0$, the transfer ratio is immediately and exactly determined,
since $X_i$ cancels out. By this token, one next might attempt to average out the noise
that is actually present through the estimator

$$\sum_{i=1}^{N} Y_{1i} \Big/ \sum_{i=1}^{N} Y_{2i} \tag{7}$$

True, the ratio between the numerator and denominator means is again exactly $H_1/H_2$,
but such an estimate cannot be expected to do well in general. This is so because
normally (at least in seismology) the $X_i$, as vectors, will lie in no constructive
relationship, with the result that near-cancellation of the vector-sum mean component may
occur in the numerator or the denominator of (7). When this happens, the estimate
will be at the mercy of the noise.
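The cancellation difficulty with (7) can be seen in a small deterministic sketch. The excitations below are hypothetical values chosen so that their phases cancel exactly; the point is only that the mean of the numerator and denominator of (7) can vanish while the total signal energy does not.

```python
# Hypothetical transfer functions and event excitations (illustration only).
H1, H2 = 2 + 1j, 1 - 1j
X = [1, -1, 1j, -1j]            # four unit-magnitude excitations with cancelling phases

# Means of the numerator and denominator of the linear estimator (7).
numerator_mean = sum(H1 * x for x in X)
denominator_mean = sum(H2 * x for x in X)

# Total signal energy, which depends on magnitudes only.
energy = sum(abs(x) ** 2 for x in X)
```

Here both means are zero, so the linear estimator would be formed from noise alone, whereas a quadratic statistic still sees a signal energy of four units.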

What  seems  to  be  needed  is  an  estimator  whose  performance  depends  on  the 
magnitudes of the $\{X_i\}$, and not on their phase angles. This suggests that, while
adhering to a ratio scheme so that the influence of the $\{X_i\}$ on the mean value of the
estimate of $H_1/H_2$ may still be suppressed, we involve the observables in the numerator,
and in the denominator, in a nonlinear way. The use of quadratics is particularly
attractive, for much is known about the statistics of quadratic forms in gaussian
variates, and we must look ahead to the need to obtain the estimate statistics, in order
to determine confidence regions. There are a number of ways in which suitable
quadratics in the $\{Y_{1i}\}, \{Y_{2i}\}$ may be formed, and selection among them would at first
seem to be largely a matter of taste. (Later, we shall show that maximum-likelihood
estimation prescribes a particular selection of quadratics, and that the choice of
quadratic-type nonlinearities themselves is no longer ad hoc in this context.)

We have taken the following path. First, we write the quantity to be estimated
in polar form:

$$H_1/H_2 = R e^{j\theta}; \qquad R > 0 \tag{8}$$

since there appears to be no preference a priori for estimating $H_1/H_2$ rather than
$H_2/H_1$, and the polar form lends itself naturally to reciprocation. Next, we consider
the estimation of R separately from that of $\theta$ [either estimate will be seen
to have a pleasing symmetry in the observables $\{Y_{1i}\}, \{Y_{2i}\}$ by virtue of the
representation (8)]. Our ad hoc estimate* of R is:

$$\hat{R} = \left| \frac{\sum_{i=1}^{N} \left( |Y_{1i}|^2 - 1 \right)}{\sum_{i=1}^{N} \left( |Y_{2i}|^2 - 1 \right)} \right|^{1/2} \tag{9}$$

and of $\theta$:

$$\hat{\theta} = \tan^{-1} \frac{\mathrm{Im}\left[ \sum_{i=1}^{N} \left( Y_{1i} Y_{2i}^* - \rho_i \right) \right]}{\mathrm{Re}\left[ \sum_{i=1}^{N} \left( Y_{1i} Y_{2i}^* - \rho_i \right) \right]} \tag{10}$$

with the quadrant assigned to $\hat{\theta}$ according to the sign of either the numerator or
denominator of (10).
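The estimators (9) and (10) are explicit enough to be sketched directly. The following is a minimal illustration, assuming normalized observations as in (3); the quadrant rule is realized here with `math.atan2`, and the simulated data in `demo` (uncorrelated noise, excitations of magnitude 3, true ratio $1.5\,e^{j0.8}$) are hypothetical values chosen only for the example.

```python
import cmath
import math
import random


def ad_hoc_estimate(y1, y2, rhos):
    """Return (R_hat, theta_hat) from (9) and (10) for normalized observations."""
    num = sum(abs(a) ** 2 - 1.0 for a in y1)
    den = sum(abs(b) ** 2 - 1.0 for b in y2)
    r_hat = math.sqrt(abs(num / den))
    cross = sum(a * b.conjugate() - rho for a, b, rho in zip(y1, y2, rhos))
    # atan2 picks the quadrant from the signs of numerator and denominator of (10).
    theta_hat = math.atan2(cross.imag, cross.real)
    return r_hat, theta_hat


def demo(n_events=2000, seed=0):
    """Simulate model (1) with uncorrelated unit-intensity noise and estimate H1/H2."""
    rng = random.Random(seed)
    h1, h2 = cmath.rect(1.5, 0.8), 1.0 + 0.0j     # hypothetical true ratio 1.5 exp(j0.8)
    s = math.sqrt(0.5)
    y1, y2, rhos = [], [], []
    for _ in range(n_events):
        x = cmath.rect(3.0, rng.uniform(-math.pi, math.pi))   # random-phase excitation
        n1 = complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
        n2 = complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
        y1.append(h1 * x + n1)
        y2.append(h2 * x + n2)
        rhos.append(0.0)
    return ad_hoc_estimate(y1, y2, rhos)


R_HAT, THETA_HAT = demo()
```

Since the excitation phases are drawn at random, the example also illustrates that the quadratic estimators do not rely on any constructive phase relationship among the $\{X_i\}$.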


* It is interesting that the ad hoc estimator (9) is quite similar to a
maximum-likelihood estimator (29.32) of K/S.
That both (9) and (10) are asymptotically unbiased and consistent estimates
of R and $\theta$, respectively, may be demonstrated by considering the typical term
$(Y_{1i} Y_{2i}^* - \rho_i)$. [The terms in (9) may be viewed as special cases, in which $\rho_i = 1$ and
$H_1$ replaces $H_2$ in (1), or vice versa.] We may write the term as

$$Y_{1i} Y_{2i}^* - \rho_i = \tfrac{1}{4} |Y_{1i} + Y_{2i}|^2 - \tfrac{1}{4} |Y_{1i} - Y_{2i}|^2 + \tfrac{j}{4} |Y_{1i} + jY_{2i}|^2 - \tfrac{j}{4} |Y_{1i} - jY_{2i}|^2 - \rho_i \tag{11}$$


and then examine a single squared magnitude at a time. Since later we will have need
of a more general statistical treatment, let us actually find the mean value of a
product $|Y_1|^2 |Y_2|^2$ formed on the new variates

$$Y_1 = M_1 + N_1, \qquad Y_2 = M_2 + N_2 \tag{12}$$

where $M_1$ and $M_2$ are given complex means and $N_1$, $N_2$ are zero-mean complex
gaussian variates satisfying the conditions:

$$2\,\overline{[\mathrm{Re}(N_k)]^2} = 2\,\overline{[\mathrm{Im}(N_k)]^2} = \sigma_k^2, \quad k = 1,2; \qquad \overline{N_1 N_2^*} = \rho_0 \sigma_1 \sigma_2; \qquad \overline{N_1 N_2} = 0 \tag{13}$$


If we set $M_1 = M_2$, $\sigma_1 = \sigma_2$ and let $\rho_0 = 0$, we have
$\overline{|Y_1|^2 |Y_2|^2} = \overline{|Y_1|^2} \cdot \overline{|Y_2|^2} = [\,\overline{|Y_1|^2}\,]^2$,
while if $\rho_0 = 1$, $\overline{|Y_1|^2 |Y_2|^2} = \overline{|Y_1|^4} = \text{variance}\,(|Y_1|^2) + [\,\overline{|Y_1|^2}\,]^2$. Thus we may
study the mean and fluctuation of any of the squared magnitudes in (11) by taking the
following general result (14), suitably identifying $M_1 = M_2$ with $H_1 X_i$ and $H_2 X_i$
through (1), setting the value of $\sigma_1 = \sigma_2$ by reference to (1) through (5b), and
then either letting $\rho_0 = 0$ or $\rho_0 = 1$.

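The general result is

$$\begin{aligned}
\overline{|Y_1|^2 |Y_2|^2} &= \overline{\left[ (m_{1R} + n_{1R})^2 + (m_{1I} + n_{1I})^2 \right] \left[ (m_{2R} + n_{2R})^2 + (m_{2I} + n_{2I})^2 \right]} \\
&= \left( m_{1R}^2 + m_{1I}^2 + \overline{n_{1R}^2} + \overline{n_{1I}^2} \right) \left( m_{2R}^2 + m_{2I}^2 + \overline{n_{2R}^2} + \overline{n_{2I}^2} \right) \\
&\quad + \overline{(n_{1R}^2 + n_{1I}^2)(n_{2R}^2 + n_{2I}^2)} - \overline{(n_{1R}^2 + n_{1I}^2)} \cdot \overline{(n_{2R}^2 + n_{2I}^2)} \\
&\quad + 4\,\overline{(m_{1R} n_{1R} + m_{1I} n_{1I})(m_{2R} n_{2R} + m_{2I} n_{2I})} \\
&= \left( |M_1|^2 + \sigma_1^2 \right) \left( |M_2|^2 + \sigma_2^2 \right) + \sigma_1^2 \sigma_2^2 |\rho_0|^2 + 2 \sigma_1 \sigma_2\, \mathrm{Re}(\rho_0 M_1^* M_2)
\end{aligned} \tag{14}$$

where $m_{kR}, m_{kI}$ and $n_{kR}, n_{kI}$ are the real and imaginary parts of $M_k$ and $N_k$, $k = 1,2$,
respectively, and we have used [as a corollary of (13)],

$$\overline{n_{1R} n_{2R}} = \overline{n_{1I} n_{2I}} = \frac{\sigma_1 \sigma_2}{2}\, \mathrm{Re}(\rho_0); \qquad \overline{n_{1I} n_{2R}} = -\,\overline{n_{1R} n_{2I}} = \frac{\sigma_1 \sigma_2}{2}\, \mathrm{Im}(\rho_0) \tag{15}$$

Also, we have invoked the fourth-moment relation for zero-mean gaussian random
variables:

$$\overline{n_1 n_2 n_3 n_4} = \overline{n_1 n_2} \cdot \overline{n_3 n_4} + \overline{n_1 n_3} \cdot \overline{n_2 n_4} + \overline{n_1 n_4} \cdot \overline{n_2 n_3} \tag{16}$$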


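Upon employing (14) in (11) with $\rho_0 = 0$, $M_1 = M_2$, $\sigma_1 = \sigma_2$ we obtain for
the mean:

$$\begin{aligned}
\overline{Y_{1i} Y_{2i}^* - \rho_i} &= \tfrac{1}{4} |X_i|^2 \left[ |H_1 + H_2|^2 - |H_1 - H_2|^2 + j |H_1 + jH_2|^2 - j |H_1 - jH_2|^2 \right] \\
&\quad + \tfrac{1}{2} \left\{ [1 + \mathrm{Re}(\rho_i)] - [1 - \mathrm{Re}(\rho_i)] + j [1 + \mathrm{Im}(\rho_i)] - j [1 - \mathrm{Im}(\rho_i)] \right\} - \rho_i \\
&= |X_i|^2 H_1 H_2^*
\end{aligned} \tag{17}$$

and with $\rho_0 = 1$ we find for the variance of each of the squared magnitudes appearing
in (11) a function of the form $C + D |X_i|^2$, where C and D depend on $H_1$, $H_2$ and $\rho_i$.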
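The decomposition (11), the mean (17), and the variance form $C + D|X_i|^2$ can be spot-checked numerically. The sketch below is deterministic and uses hypothetical function names; the variance helper takes the closing expression of (14) with $\rho_0 = 1$, $M_1 = M_2$, $\sigma_1 = \sigma_2$, under the assumption that the cross term of (14) is $2\sigma_1\sigma_2\,\mathrm{Re}(\rho_0 M_1^* M_2)$.

```python
def polarization_sum(a, b):
    """Right side of (11) without the -rho term: equals a * conj(b)."""
    return (abs(a + b) ** 2 - abs(a - b) ** 2
            + 1j * abs(a + 1j * b) ** 2 - 1j * abs(a - 1j * b) ** 2) / 4.0


def mean_of_term(h1, h2, x, rho):
    """Mean of Y1*conj(Y2) - rho assembled as in (17), using E|M + N|^2 = |M|^2 + E|N|^2."""
    signal = abs(x) ** 2 * polarization_sum(h1, h2)
    noise = 0.5 * ((1 + rho.real) - (1 - rho.real)
                   + 1j * (1 + rho.imag) - 1j * (1 - rho.imag))
    return signal + noise - rho


def squared_magnitude_variance(mean, sigma2):
    """Variance of |M + N|^2 from (14) with rho0 = 1: sigma^4 + 2 sigma^2 |M|^2."""
    second_moment = abs(mean) ** 2 + sigma2
    fourth_moment = second_moment ** 2 + sigma2 ** 2 + 2.0 * sigma2 * abs(mean) ** 2
    return fourth_moment - second_moment ** 2
```

For example, `mean_of_term(h1, h2, x, rho)` should agree with `abs(x)**2 * h1 * h2.conjugate()` for any complex inputs, and the variance is linear in $|M|^2$, which is the origin of the form $C + D|X_i|^2$.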

Therefore, the variances of both the real and imaginary parts of (11) are
upper-bounded by functions of this form, and further study shows that weaker but similar
bounds can be obtained that do not involve $\rho_i$. Thus, the ratio of the variance of either
the real or imaginary component of

$$\sum_{i=1}^{N} \left( Y_{1i} Y_{2i}^* - \rho_i \right)$$

to the squared mean of either component is upper-bounded by a function of the form

$$\frac{E N + F \sum_{i=1}^{N} |X_i|^2}{\left[ \sum_{i=1}^{N} |X_i|^2 \right]^2} \tag{18}$$

where E and F depend only on $H_1$ and $H_2$. Hence so long as

$$N^{-1/2} \sum_{i=1}^{N} |X_i|^2 \tag{19}$$

tends to infinity as $N \to \infty$ (it can be demonstrated that this is in general also a necessary
condition), the ratio in (10) will converge (probabilistically) to $\mathrm{Im}(H_1 H_2^*)/\mathrm{Re}(H_1 H_2^*)$,
so that this estimate of phase angle is asymptotically unbiased and consistent. (A little
further work establishes that, asymptotically, the sign of the numerator or denominator
will lie in the correct quadrant with probability one.)

By letting $H_2$ be replaced by $H_1$ and setting $\rho_i = 1$, we respectively force both
the signals and the noises in (1) to be identical, so that $Y_{1i} = Y_{2i}$. Then by (17),

$$\overline{|Y_{1i}|^2 - 1} = |X_i|^2 |H_1|^2 \tag{20}$$

Reasoning like that applied to the estimate (10) now shows that if again (19) tends to
infinity with growing N, (9) is an asymptotically unbiased and consistent estimator of
$R = |H_1/H_2|$. Thus, the ad hoc estimators (9) and (10) together provide an asymptotically
correct measurement of the transfer-function ratio $H_1/H_2$.


Let us next consider the problem of obtaining the statistics of (9) or (10)
more precisely, now letting $H_1$, $H_2$ and the $\{X_i\}$ be known. As mentioned before,
such analysis is needed in locating the confidence region about the estimate $\hat{R} e^{j\hat{\theta}}$ of
$H_1/H_2$. [We consider (9) and (10) separately, although in principle the confidence
region should be founded on their joint statistics.] The chief difficulty here is that of
statistical dependence between numerator and denominator, which persists in (10)
even when all the correlation coefficients $\{\rho_i\}$ vanish so that the $\{Y_{1i}, Y_{2i}\}$ become
independent within as well as between pairs.

The first steps in handling the dependence are to write the cumulative probability
function for the ratio in question in the form

$$\Pr\left( \frac{n}{d} < x \right) = \Pr(n - xd < 0,\ d > 0) + \Pr(n - xd > 0,\ d < 0); \qquad [\Pr(d = 0) = 0] \tag{21}$$

and then apply the bounds

$$\Pr(n - xd < 0) - \Pr(d < 0) \le \Pr(n - xd < 0,\ d > 0) \le \min\{ \Pr(n - xd < 0),\ \Pr(d > 0) \}$$

and similarly for $\Pr(n - xd > 0,\ d < 0)$. This yields

$$\Pr(n - xd < 0) - \Pr(d < 0) \le \Pr\left( \frac{n}{d} < x \right) \le \Pr(n - xd < 0) + \Pr(d < 0) \tag{22}$$

and if attention is limited* to those situations in which (9) affords a reasonably good
estimate of the magnitude R of $H_1/H_2$, $\Pr(d < 0)$ will be entirely negligible. Turning
to (10), good estimation conditions will not necessarily cause the denominator in
the arctangent argument greatly to favor one sign, but should it not, then the numerator
definitely will; therefore, in such circumstances the reciprocal of the argument can be
tightly bounded** and this is equally satisfactory.
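The bounds (22) hold for any joint distribution of n and d, and so they also hold for empirical frequencies computed from a set of samples. The sketch below illustrates this with hypothetical gaussian samples in which the denominator is rarely negative; the function name and the sample parameters are assumptions made for the illustration.

```python
import random


def ratio_cdf_bounds(samples, x):
    """Return (lower, exact, upper) empirical versions of the three members of (22)."""
    total = len(samples)
    lin = sum(1 for n, d in samples if n - x * d < 0)      # event {n - xd < 0}
    d_neg = sum(1 for n, d in samples if d < 0)            # event {d < 0}
    # Event {n/d < x}, written through the decomposition (21).
    ratio = sum(1 for n, d in samples
                if (n - x * d < 0 and d > 0) or (n - x * d > 0 and d < 0))
    return (lin - d_neg) / total, ratio / total, (lin + d_neg) / total


rng = random.Random(7)
# Hypothetical numerator and denominator samples; d has mean 2 and unit spread.
SAMPLES = [(5.0 + rng.gauss(0.0, 1.0), 2.0 + rng.gauss(0.0, 1.0)) for _ in range(5000)]
LOWER, EXACT, UPPER = ratio_cdf_bounds(SAMPLES, 2.5)
```

The gap between the two bounds is twice the frequency of negative denominators, so it closes as $\Pr(d < 0)$ becomes negligible, which is the situation assumed in the text.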

With $\Pr(n/d < x)$ thus closely approximated by $\Pr(n - xd < 0)$, a pair of linear
transformations can be applied to the gaussian variates that appear quadratically in
$(n - xd)$, to yield a new expression identical in sign to $(n - xd)$, but in which all the
quadratics are now mutually independent. Proceeding in this manner for (9)
(henceforth overlooking the approximation, and ignoring the magnitude signs in view of the
virtual certainty that both numerator and denominator will be positive under good
estimation conditions), we find

$$\Pr(\hat{R} < r) = \Pr\left\{ \sum_{i=1}^{N} \frac{|Z_{1i}|^2 - |Z_{2i}|^2}{8\,\mathrm{Re}(\mu_{Ri})} - N(1 - r^2) < 0 \right\} \tag{23}$$

where

$$Z_{1i} = Y_{1i}(1 + \mu_{Ri}) + r(1 - \mu_{Ri}) Y_{2i}, \qquad Z_{2i} = \ldots \tag{24}$$

[the expression for $Z_{2i}$ is illegible in the source] and

$$\mu_{Ri} = \frac{2jr\,\mathrm{Im}(\rho_i) + \sqrt{(1 + r^2)^2 - 4r^2 |\rho_i|^2}}{1 + r^2 - 2r\,\mathrm{Re}(\rho_i)} \tag{25}$$

* In discussing their estimate (29.32), K/S cite the "inescapable difficulty" met with
estimators of the type (9) and (10) when $\Pr(d < 0) \ne 0$.

** Here, $\Pr(d > 0)$ may just as likely be the negligible quantity, rather than $\Pr(d < 0)$;
in essence we are noting that under good estimation conditions there is virtually no
chance of wrongly guessing the pair (or pairs) of adjacent quadrants in which the phase
estimate will lie. K/S argue similarly in their Section 29.21.
For any i, it may be verified through (1), (3), and (5) that
$\overline{Z_{1i} Z_{2i}^*} - (\overline{Z_{1i}} \cdot \overline{Z_{2i}^*}) = 0 = \overline{Z_{1i} Z_{2i}} - (\overline{Z_{1i}} \cdot \overline{Z_{2i}})$;
this establishes the independence of $Z_{1i}$ and $Z_{2i}$, since
they are jointly gaussian variates. Since the real and imaginary parts of $Z_{1i}$ or $Z_{2i}$
are independent and of equal variance, the $\{|Z_{1i}|^2\}, \{|Z_{2i}|^2\}$ constitute a set of mutually
independent non-central chi-square variates, each having two degrees of freedom.

It can be shown that if the noise correlations $\{\rho_i\}$ are all equal in magnitude
($|\rho_i| = |\rho|$ for all i, a condition that may not always apply in seismic equalization), the
component variances of the $\{Z_{1i}\}$ are all equal, and likewise for the $\{Z_{2i}\}$. The sum

$$\Sigma_1 = \sum_{i=1}^{N} |Z_{1i}|^2 \big/ [8\,\mathrm{Re}(\mu_{Ri})]$$

then has a non-central chi-square distribution of 2N degrees of freedom, with
probability density given by

$$p_1(\Sigma_1) = \frac{1}{2\sigma_1^2} \left( \chi_1^2 / \lambda_1 \right)^{(N-1)/2} \exp\left[ -(\chi_1^2 + \lambda_1)/2 \right] I_{N-1}\!\left( \sqrt{\lambda_1 \chi_1^2} \right); \qquad \Sigma_1 > 0 \tag{26}$$

where $I_{N-1}(z)$ is the modified Bessel function of argument z and order (N-1).
Here, $\chi_1^2 = \Sigma_1 / \sigma_1^2$ and

$$4\sigma_1^2 = s + 1 - r^2; \qquad s = \sqrt{(1 + r^2)^2 - 4r^2 |\rho|^2} \tag{27}$$

$$\lambda_1 = \frac{\sum_{i=1}^{N} |X_i|^2 \left[ |H_1|^2 (1 + r^2 + s) + r^2 |H_2|^2 (1 + r^2 - s) - 4r^2\,\mathrm{Re}(H_1 H_2^* \rho_i^*) \right]}{s^2 + s(1 - r^2)} \tag{28}$$

A like result [the subscripts in (26) changing from "1" to "2"] obtains for the
probability density of

$$\Sigma_2 = \sum_{i=1}^{N} |Z_{2i}|^2 \big/ [8\,\mathrm{Re}(\mu_{Ri})]$$

with

$$4\sigma_2^2 = s - 1 + r^2 \tag{29}$$

$$\lambda_2 = \frac{\sum_{i=1}^{N} |X_i|^2 \left[ |H_1|^2 (1 + r^2 - s) + r^2 |H_2|^2 (1 + r^2 + s) - 4r^2\,\mathrm{Re}(H_1 H_2^* \rho_i^*) \right]}{s^2 - s(1 - r^2)} \tag{30}$$

Thus,

$$\Pr(\hat{R} < r) = \int_0^{\infty} p_2(\Sigma_2)\, d\Sigma_2 \int_0^{\Sigma_2 + N(1 - r^2)} p_1(\Sigma_1)\, d\Sigma_1 \tag{31}$$

and the statistics of the estimate (9) of $|H_1/H_2|$ are available through a double integral
involving a pair of Bessel functions in the integrand. It does not appear possible to
perform the integration analytically. Incidentally, when all the $\{\rho_i\}$ vanish, n and d
in (21) involve quadratic variates that are independent, and it then happens that
$\Pr(\hat{R} < r)$ can be evaluated precisely through a combination of double integrations like
that of (31) where the double inequalities in the probability statements of (21) are
met through suitable choice of the integration limits. This affords a numerical check
of the approximate, asymptotic result (38) that is given later.

The statistics of the phase estimate $\hat{\theta}$ of (10) may be closely approximated in
a manner paralleling that just developed for $\hat{R}$. In this way we find, when $\rho_i = \rho$ for all
i (equality of correlation magnitudes is no longer sufficient), and ignoring the
approximation,

$$\Pr(\tan\hat{\theta} < t) = \int_0^{\infty} p_4(\Sigma_4)\, d\Sigma_4 \int_0^{\Sigma_4 + N[\mathrm{Im}(\rho) - t\,\mathrm{Re}(\rho)]} p_3(\Sigma_3)\, d\Sigma_3 \tag{32}$$

if it is virtually certain that

$$\mathrm{Re}\left[ \sum_{i=1}^{N} \left( Y_{1i} Y_{2i}^* - \rho \right) \right] > 0 .$$

When the opposite is virtually certain, (32) equals $\Pr(\tan\hat{\theta} > t)$; if neither is certain
because the angle $\theta$ of $H_1/H_2$ lies near $(\pi/2)$ or $-(\pi/2)$, then we deal with $\operatorname{ctn}\hat{\theta}$, whose
denominator is of virtually certain sign, and obtain a similar expression to (32). In
(32) $p_3$ and $p_4$ are given by (26) as re-subscripted, where

$$\begin{aligned}
4\sigma_3^2 &= u - \mathrm{Re}[(j + t)\rho]; \qquad u = \sqrt{1 + t^2 - \{\mathrm{Im}[(j + t)\rho]\}^2} \\
4\sigma_4^2 &= u + \mathrm{Re}[(j + t)\rho] \\
\lambda_3 &= \frac{\sum_{i=1}^{N} |X_i|^2 \left[ (t^2 + 1)(|H_1|^2 + |H_2|^2) - 2\,\mathrm{Im}[(j + t) H_1 H_2^*]\,\mathrm{Im}[(j + t)\rho] - 2u\,\mathrm{Re}[(j + t) H_1 H_2^*] \right]}{4u^2 - 4u\,\mathrm{Re}[(j + t)\rho]} \\
\lambda_4 &= \frac{\sum_{i=1}^{N} |X_i|^2 \left[ (t^2 + 1)(|H_1|^2 + |H_2|^2) - 2\,\mathrm{Im}[(j + t) H_1 H_2^*]\,\mathrm{Im}[(j + t)\rho] + 2u\,\mathrm{Re}[(j + t) H_1 H_2^*] \right]}{4u^2 + 4u\,\mathrm{Re}[(j + t)\rho]}
\end{aligned} \tag{33}$$

(If all the $\{\rho_i\}$ vanish, (32) may be evaluated analytically through Price [4].) Equations
(32) and (33) follow from the relation, valid when the denominator of the arctangent
argument in (10) is certain to be positive,

$$\Pr(\tan\hat{\theta} < t) = \Pr\left\{ \sum_{i=1}^{N} \frac{|Z_{3i}|^2 - |Z_{4i}|^2}{8\,\mathrm{Re}(\mu_{\theta i})} - \sum_{i=1}^{N} [\mathrm{Im}(\rho_i) - t\,\mathrm{Re}(\rho_i)] < 0 \right\} \tag{34}$$


Here the $\{Z_{3i}, Z_{4i}\}$ are mutually independent gaussian variates given by

[Equations (35) and (36), which define $Z_{3i}$, $Z_{4i}$ and the coefficient $\mu_{\theta i}$, are illegible in the source.]


Thus, when all the noise correlations $\{\rho_i\}$ are equal and good estimation
conditions exist, we are able to obtain from (26) through (33) quite accurate results
for the statistics of the estimated magnitude and phase of the transfer-function ratio
$H_1/H_2$. Using these statistics, confidence regions can be set up about the estimate of
$H_1/H_2$, and by treating different sets of observation-pairs the validity of the model
(1) itself can be quantitatively tested. It is interesting to note that, as desired, the
phase angles of the $\{X_i\}$ never affect the performance of the estimate; moreover, when
all the $\{\rho_i\}$ are equal, the performance does not depend on the individual $\{|X_i|\}$ but
only on the total signal "energy" in the observations

$$E = \sum_{i=1}^{N} |X_i|^2 \tag{37}$$

as multiplied by either one of the energy gains $|H_1|^2$, $|H_2|^2$. This is fortunate, for
it leaves just a single unspecified parameter in the determination of the confidence
region (assuming that a confidence percentage has been assigned and that there is no
question of how this probability should be distributed in the excluded region; see
Chapter 20 of K/S). Increasing this parameter E (presumably) shrinks the confidence
region, and E can probably be conservatively estimated through the numerator or
denominator of (9), or perhaps both together.

A complicating factor in determining the confidence region is that one does not
have the joint statistics of $\hat{R}$ and $\hat{\theta}$ (they are generally dependent, even for
asymptotically large N) and yet in general their individual statistics depend on both true
values R and $\theta$. This problem needs further examination (or recourse to a search of
the literature), but the following expedient seems reasonable. First, for the estimate
$\hat{R}$ given by (9) find the confidence intervals for R corresponding to all values of the
true parameter $\theta$, using (26) through (31) (if all the $\{\rho_i\}$ vanish, there will be no
dependence on $\theta$). Then do the converse for the estimate $\hat{\theta}$ given by (10), using
(32) and (33). In this manner we obtain two regions, both in R and $\theta$. The intersection
of these two regions may be taken to be the final confidence region, with a confidence
of the order of whatever common percentage was adopted in setting up all of the
confidence intervals.

It seems plausible that as $N \to \infty$, the quadratic sums in the right members of
(23) and (34) will become gaussianly distributed. At least, this occurs if all the
noise correlations $\{\rho_i\}$ are equal, for then the component variances of the $Z$
variates entering these sums do not depend on $i$, and the $|X_i|^2$ enter into the statistics
of the sums only through their sum. Thus when the latter sum is imagined to be reapportioned
equally among all $i$, the Central Limit Theorem applies to each of the quadratic sums.
The range of the $\{\rho_i\}$ being restricted ($|\rho_i| \le 1$), one feels that $N \to \infty$ implies sufficient
"bunching" of the $\{\rho_i\}$ that the Central Limit Theorem is still effective, although this
certainly remains to be proven.

If, furthermore, estimation conditions improve as $N \to \infty$, as they will if the
total signal energy $E$ of (37) grows faster than $\sqrt{N}$, then in (23) and (34) the local
linearity of the right members in $r$ and $\tan^{-1} t$, respectively, implies that $\hat{R}$ and $\hat{\theta}$ will
themselves be asymptotically gaussian (with means equaling the true values $R$ and $\theta$).
Here, we are drawing on the asymptotic consistency and lack of bias established
earlier for (9) and (10). It is of interest to determine the variances of $\hat{R}$ and $\hat{\theta}$ in
the asymptotic situation, for if they are asymptotically normal these are all that we
need  in  order  to  draw  a  confidence  region.  By  employing  (  14  )  with  Pq=  0  or  1  in 
(23)  and  (34)  ,  some  manipulation  shows  that  the  asymptotic  variances  are  given  by 


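$$\frac{\sigma_R^2}{R^2} = \frac{\sum_{i=1}^{N} |X_i|^2 \left[ R + R^{-1} - 2|\rho_i| \cos(\theta - \theta_i) \right]}{2 |H_1| |H_2| E^2} + \frac{\sum_{i=1}^{N} \left[ R^2 + R^{-2} - 2|\rho_i|^2 \right]}{4 |H_1|^2 |H_2|^2 E^2} \qquad (38)$$

$$\sigma_\theta^2 = \frac{\sum_{i=1}^{N} |X_i|^2 \left[ R + R^{-1} - 2|\rho_i| \cos(\theta - \theta_i) \right]}{2 |H_1| |H_2| E^2} + \frac{\sum_{i=1}^{N} \left[ 1 - |\rho_i|^2 + 2|\rho_i| \cos^2(\theta - \theta_i) \right]}{2 |H_1|^2 |H_2|^2 E^2} \qquad (39)$$
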
where $\rho_i = |\rho_i| \exp[j\theta_i]$ and $E$ is given by (37). [That the asymptotic means are $R$
and $\theta$ is verified concurrently. These mean-and-variance results can no doubt be
derived directly from (9) and (10) and the assumption of improving estimation
conditions ($E \to \infty$), without any appeal to asymptotic normality, and perhaps even without
requiring $N \to \infty$.]
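
The asymptotic variances of the ad hoc estimates can be evaluated numerically once the signal energies, noise correlations, and gains are given. The following Python sketch is illustrative only: the function name and argument layout are our own assumptions, and it encodes (38) and (39) as we read them, with $E = \sum_i |X_i|^2$ taken as the total signal energy of (37).

```python
import math

def adhoc_asymptotic_variances(X_abs2, rho_abs, rho_ang, H1_abs, H2_abs, theta):
    """Return (sigma_R^2 / R^2, sigma_theta^2) as read from (38) and (39).

    X_abs2  : list of |X_i|^2
    rho_abs : list of |rho_i|
    rho_ang : list of theta_i, the angles of the noise correlations
    theta   : true angle of H1/H2
    (All argument names are hypothetical.)
    """
    R = H1_abs / H2_abs
    E = sum(X_abs2)                      # assumed form of (37)
    g = H1_abs * H2_abs
    # First term, common to (38) and (39).
    first = sum(x2 * (R + 1.0 / R - 2.0 * r * math.cos(theta - a))
                for x2, r, a in zip(X_abs2, rho_abs, rho_ang)) / (2.0 * g * E**2)
    # Second term of (38).
    second_R = sum(R**2 + R**-2 - 2.0 * r**2 for r in rho_abs) / (4.0 * g**2 * E**2)
    # Second term of (39).
    second_T = sum(1.0 - r**2 + 2.0 * r * math.cos(theta - a)**2
                   for r, a in zip(rho_abs, rho_ang)) / (2.0 * g**2 * E**2)
    return first + second_R, first + second_T
```

When all the noise correlations vanish and $R = 1$, both returned values reduce to $1/(|H_1|^2 E) + N/(2 |H_1|^4 E^2)$, which gives a quick sanity check on any implementation.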

IV.  Maximum-Likelihood ("ML") Estimation

We now take an entirely synthetic rather than ad hoc approach to the problem
of estimating the transfer-function ratio $H_1/H_2$ from the given complex-valued
observation pairs $\{Y_{1i}, Y_{2i}\}$, $i = 1, \ldots, N$, generated as in (1). Setting up the likelihood
function, i.e., the probability density function of the $\{Y_{1i}, Y_{2i}\}$, we proceed to choose
the unknowns $H_1$, $H_2$ and $\{X_i\}$ so that this function achieves its maximum value, and
take the ratio $\hat{H}_1/\hat{H}_2$ thus found as our estimate of $H_1/H_2$. Of course, the notion that it
is good to maximize the likelihood is ad hoc in the first place. Furthermore, we shall
see that this estimate is in general far less explicit than (9) and (10), and even when
explicit, its statistics are usually difficult to derive. When estimation conditions are
sufficiently good, however, the ML estimate can be definitely superior to the ad hoc
estimate provided by (9) and (10), and thus certainly merits our attention.

We begin as before by assuming the conditions (2), (3), and (5). Should
the noise intensities initially differ from observation to observation, but (4) be
satisfied, it is not difficult to show that the ML method dictates the noise normalization
as given. Rather than deal with the $\{Y_{1i}, Y_{2i}\}$ directly, which are usually correlated,
we may equivalently and more conveniently maximize the likelihood of the linearly
transformed (and still gaussian) variates

$$\begin{aligned}
Y'_{1i} &= \frac{Y_{1i} + \left(\sqrt{1-\rho_{Ii}^2} - j\rho_{Ii}\right) Y_{2i}}{\sqrt{1 - \rho_{Ii}^2 + \rho_{Ri}\sqrt{1-\rho_{Ii}^2}}} \\
Y'_{2i} &= \frac{\left(-\sqrt{1-\rho_{Ii}^2} + j\rho_{Ii}\right) Y_{1i} + Y_{2i}}{\sqrt{1 - \rho_{Ii}^2 - \rho_{Ri}\sqrt{1-\rho_{Ii}^2}}}
\end{aligned} \qquad (40)$$

where

$$\rho_i = \rho_{Ri} + j\rho_{Ii} \qquad (41)$$

is the known noise correlation coefficient for the $i$th observation pair. A little
calculation now shows that the $\{Y'_{1i}, Y'_{2i}\}$ are mutually independent, and have real and
imaginary components all of equal variance.
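
The "little calculation" can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the transformation (40) to have coefficient $\sqrt{1-\rho_{Ii}^2} - j\rho_{Ii}$ on $Y_{2i}$ in the first variate and its negative on $Y_{1i}$ in the second, with normalizers $\sqrt{1-\rho_{Ii}^2 \pm \rho_{Ri}\sqrt{1-\rho_{Ii}^2}}$, and it assumes circular noise of unit variance with $\overline{n_1 n_2^*} = \rho$. The function name is hypothetical.

```python
import math

def transformed_noise_covariance(rho):
    """Covariance matrix of the noise in (Y1', Y2') for one observation pair.

    Assumes unit-variance circular complex noise with E[n1 n2*] = rho
    (a complex number with |rho| < 1) and the transformation (40) as
    read in the lead-in above.
    """
    rR, rI = rho.real, rho.imag
    s = math.sqrt(1.0 - rI**2)
    a = complex(s, -rI)                       # sqrt(1 - rI^2) - j*rI, unit modulus
    d1 = math.sqrt(1.0 - rI**2 + rR * s)      # normalizer of Y1'
    d2 = math.sqrt(1.0 - rI**2 - rR * s)      # normalizer of Y2'
    # Rows of T map (n1, n2) to the transformed noise pair.
    T = [[complex(1.0 / d1), a / d1],
         [-a / d2, complex(1.0 / d2)]]
    C = [[complex(1.0), rho],
         [rho.conjugate(), complex(1.0)]]
    # Form T C T^H entry by entry.
    return [[sum(T[p][k] * C[k][l] * T[q][l].conjugate()
                 for k in range(2) for l in range(2))
             for q in range(2)]
            for p in range(2)]
```

Under these assumptions the off-diagonal entries vanish and both diagonal entries equal 2, so the transformed variates are uncorrelated (hence independent, being gaussian) with equal variances.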
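
Upon finding the means of the $\{Y'_{1i}, Y'_{2i}\}$ from (1), we determine that the
likelihood is maximized when we minimize the quantity

$$\begin{aligned}
&\sum_{i=1}^{N} \frac{\left| \left[ Y_{1i} + Y_{2i}\left(\sqrt{1-\rho_{Ii}^2} - j\rho_{Ii}\right) \right] - X_i \left[ H_1 + H_2\left(\sqrt{1-\rho_{Ii}^2} - j\rho_{Ii}\right) \right] \right|^2}{1 - \rho_{Ii}^2 + \rho_{Ri}\sqrt{1-\rho_{Ii}^2}} \\
&\quad + \sum_{i=1}^{N} \frac{\left| \left[ Y_{1i}\left(-\sqrt{1-\rho_{Ii}^2} + j\rho_{Ii}\right) + Y_{2i} \right] - X_i \left[ H_1\left(-\sqrt{1-\rho_{Ii}^2} + j\rho_{Ii}\right) + H_2 \right] \right|^2}{1 - \rho_{Ii}^2 - \rho_{Ri}\sqrt{1-\rho_{Ii}^2}}
\end{aligned} \qquad (42)$$
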


This is most conveniently done term-by-term, minimizing over the $\{X_i\}$ while at first
holding $H_1$ and $H_2$ fixed. Expanding the squared magnitudes, it is seen by inspection
that for $|X_i|$ given, the minimizing angle of $X_i$ is that of
$[H_1^* Y_{1i} + H_2^* Y_{2i} - \rho_{Ri}(H_1^* Y_{2i} + H_2^* Y_{1i}) - j\rho_{Ii}(H_1^* Y_{2i} - H_2^* Y_{1i})]$.
The quadratic in $|X_i|$ obtained upon substituting this result
back in the $i$th term of the expanded version of (42) may now be straightforwardly
minimized by differentiation, whereupon our ML estimate of $X_i$ is found to be

$$\hat{X}_i = \frac{H_1^* Y_{1i} + H_2^* Y_{2i} - \rho_i H_1^* Y_{2i} - \rho_i^* H_2^* Y_{1i}}{|H_1|^2 + |H_2|^2 - 2\,\mathrm{Re}(H_1 H_2^* \rho_i^*)} \qquad (43)$$
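
For fixed $H_1$ and $H_2$, the signal estimate (43) is a simple weighted combination of the two observations. A minimal Python sketch follows, assuming (43) has numerator $H_1^* Y_{1i} + H_2^* Y_{2i} - \rho_i H_1^* Y_{2i} - \rho_i^* H_2^* Y_{1i}$ and denominator $|H_1|^2 + |H_2|^2 - 2\,\mathrm{Re}(H_1 H_2^* \rho_i^*)$; the function name is our own.

```python
def ml_signal_estimate(Y1, Y2, H1, H2, rho):
    """ML estimate of X_i for one observation pair, per (43) as read above.

    All arguments are Python complex numbers; rho is the noise
    correlation coefficient for this pair (|rho| < 1).
    """
    num = (H1.conjugate() * Y1 + H2.conjugate() * Y2
           - rho * H1.conjugate() * Y2
           - rho.conjugate() * H2.conjugate() * Y1)
    den = abs(H1)**2 + abs(H2)**2 - 2.0 * (H1 * H2.conjugate() * rho.conjugate()).real
    return num / den
```

With noise-free observations $Y_{1i} = H_1 X_i$, $Y_{2i} = H_2 X_i$, the numerator becomes $X_i$ times the denominator, so the estimate returns $X_i$ exactly, as it should.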


With (43) substituted in (42) we determine, after considerable algebra,
that now

$$\sum_{i=1}^{N} \frac{\left| Y_{1i} - (H_1/H_2)\, Y_{2i} \right|^2}{1 + |H_1/H_2|^2 - 2\,\mathrm{Re}\left[ (H_1/H_2)\, \rho_i^* \right]} \qquad (44)$$

is to be minimized with respect to $H_1$ and $H_2$. Note, however, that (44) involves $H_1$
and $H_2$ only through the very parameter $H_1/H_2$ that we seek to estimate. In general,
we must stop here, leaving it to a computer to perform the minimization of (44) with
respect to the magnitude and angle of $H_1/H_2$. Even when all the $\{\rho_i\}$ are equal but
non-zero, it does not seem possible to proceed explicitly beyond the minimization with
respect to magnitude for a given angle, or vice versa.
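
The text leaves the minimization of (44) to a computer without prescribing a method. One simple possibility, offered here only as a sketch and not as the author's procedure, is a shrinking grid search over the log-magnitude and angle of $H_1/H_2$, started from some initial guess (for instance the ad hoc estimate). The function names, the starting point `h0`, and the search parameters are all assumptions.

```python
import cmath
import math

def ml_objective(h, Y1, Y2, rho):
    """Value of (44) at the trial ratio h = H1/H2 (complex)."""
    return sum(abs(y1 - h * y2)**2
               / (1.0 + abs(h)**2 - 2.0 * (h * r.conjugate()).real)
               for y1, y2, r in zip(Y1, Y2, rho))

def ml_ratio_search(Y1, Y2, rho, h0, span=1.0, steps=21, rounds=12):
    """Minimize (44) by repeated grid search in (log|h|, arg h) around h0.

    Each round scans a (steps x steps) grid of half-width `span` centred on
    the current best point, then halves the span. `steps` should be odd so
    that the current best point is always on the grid.
    """
    d0, t0 = math.log(abs(h0)), cmath.phase(h0)
    for _ in range(rounds):
        candidates = []
        for a in range(steps):
            for b in range(steps):
                d = d0 + span * (2.0 * a / (steps - 1) - 1.0)
                t = t0 + span * (2.0 * b / (steps - 1) - 1.0)
                h = cmath.rect(math.exp(d), t)
                candidates.append((ml_objective(h, Y1, Y2, rho), d, t))
        _, d0, t0 = min(candidates)   # keep the grid point with smallest (44)
        span /= 2.0
    return cmath.rect(math.exp(d0), t0)
```

A grid search is chosen here only because it needs no derivatives; it finds a local minimum near `h0`, so a poor starting point or multiple minima of (44) would call for a wider initial span or a more careful optimizer.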

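When the $\{\rho_i\}$ all vanish, however, we find that the minimizing angle $\hat{\theta}$ is that
of

$$\sum_{i=1}^{N} Y_{1i} Y_{2i}^*$$

so that in this case the ad hoc (10) and ML estimates are identical. On the other
hand, the ML estimate of the transfer-function magnitude is found to be, for vanishing
noise correlations,

$$\hat{R} = \frac{\sum_{i=1}^{N} |Y_{1i}|^2 - \sum_{i=1}^{N} |Y_{2i}|^2 + \sqrt{\left( \sum_{i=1}^{N} |Y_{1i}|^2 - \sum_{i=1}^{N} |Y_{2i}|^2 \right)^2 + 4 \left| \sum_{i=1}^{N} Y_{1i} Y_{2i}^* \right|^2}}{2 \left| \sum_{i=1}^{N} Y_{1i} Y_{2i}^* \right|} \qquad (45)$$
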
which is quite in contrast to (9), even though quadratics in the $\{Y_{1i}, Y_{2i}\}$ are involved
in both. [Equation (45) does not appear amenable to statistical analysis except when
$\left| \sum_i Y_{1i} Y_{2i}^* \right|$ can with high probability be closely approximated by
$\mathrm{Re}\{ e^{-j\theta} \sum_i Y_{1i} Y_{2i}^* \}$, where $\theta$ is the angle of $H_1/H_2$.] It is interesting to compare (45)
with the similar result (29.29) that K/S obtain in the scalar analog to (1).
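
For vanishing noise correlations the minimization of (44) can be carried out in closed form. The sketch below is our own derivation from (44) with every $\rho_i = 0$, not a transcription of the printed equation: writing $A = \sum |Y_{1i}|^2$, $B = \sum |Y_{2i}|^2$ and $S = \sum Y_{1i} Y_{2i}^*$, the minimizing angle is that of $S$, and setting the derivative with respect to the magnitude to zero gives the quadratic $|S| R^2 + (B - A) R - |S| = 0$, whose positive root is returned. Function and variable names are hypothetical.

```python
import cmath
import math

def ml_ratio_uncorrelated(Y1, Y2):
    """Minimizer (R, theta) of (44) when all noise correlations vanish.

    Y1, Y2 are equal-length lists of complex observations; the cross
    sum S must be non-zero for the magnitude estimate to be defined.
    """
    A = sum(abs(y)**2 for y in Y1)
    B = sum(abs(y)**2 for y in Y2)
    S = sum(y1 * y2.conjugate() for y1, y2 in zip(Y1, Y2))
    theta = cmath.phase(S)                       # minimizing angle
    # Positive root of |S| R^2 + (B - A) R - |S| = 0.
    R = ((A - B) + math.sqrt((A - B)**2 + 4.0 * abs(S)**2)) / (2.0 * abs(S))
    return R, theta
```

With noise-free data $Y_{1i} = h\,Y_{2i}$ one has $A = |h|^2 B$ and $S = hB$, and the expression returns $|h|$ and the angle of $h$ exactly.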

The final goal of this analysis is to study the asymptotic behavior of the estimate
$\hat{H}_1/\hat{H}_2$ that minimizes (44), and to compare it with that of the ad hoc estimate
provided by (9) and (10). It is fortunate that even though the ML estimate itself must
in general be found through trial-and-error, its asymptotic statistics can be determined
quite explicitly. Our procedure is to evaluate the second derivative of the mean value
of (44), taken with respect to the logarithm of the magnitude (or the phase) of $H_1/H_2$
at the true value of $H_1/H_2$ (where the mean of (44) must asymptotically have its
minimum), and divide the square of this derivative into the variance of the companion
first derivative. The result is the asymptotic variance of the log-magnitude (or phase)
estimate.

Performing this analysis for the log-magnitude $\delta = \log R$, where $R$ is given by
(8), we find by applying (14) to the derivatives of (44) and doing considerable
algebra that the asymptotic variance of the ML estimate is (since $\overline{\hat{\delta}} = \delta$)

$$\overline{(\hat{\delta} - \delta)^2} = \frac{1}{2 \displaystyle\sum_{i=1}^{N} \frac{|X_i|^2 |H_1| |H_2|}{R + R^{-1} - 2|\rho_i| \cos(\theta - \theta_i)}} + \frac{\displaystyle\sum_{i=1}^{N} \frac{1 - |\rho_i|^2}{\left[ R + R^{-1} - 2|\rho_i| \cos(\theta - \theta_i) \right]^2}}{2 \left[ \displaystyle\sum_{i=1}^{N} \frac{|X_i|^2 |H_1| |H_2|}{R + R^{-1} - 2|\rho_i| \cos(\theta - \theta_i)} \right]^2} \qquad (46)$$

where again $\theta_i$ is the angle of the $i$th noise correlation coefficient. Upon carrying
through the like analysis for the asymptotic variance of the ML phase estimate $\hat{\theta}$, we
obtain identically the same result as (46), which is quite pleasing, and to be
contrasted with (38) and (39) for the ad hoc estimation. Moreover, the magnitude
and phase errors of the ad hoc estimates are generally found to be coupled, whereas
further analysis shows that the first derivative of (44) with respect to $\delta$ is
asymptotically uncorrelated with that with respect to $\theta$, when both are taken at the true value
of $H_1/H_2 = \exp(\delta + j\theta)$.
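
The common asymptotic variance (46) of the ML log-magnitude and phase estimates is easy to evaluate numerically. The Python sketch below is illustrative, with a hypothetical function name and argument layout. It assumes (46) is the sum of $1/(2S)$ and $T/(2S^2)$, where $S = \sum_i |X_i|^2 |H_1||H_2| / D_i$, $T = \sum_i (1 - |\rho_i|^2)/D_i^2$, and $D_i = R + R^{-1} - 2|\rho_i|\cos(\theta - \theta_i)$.

```python
import math

def ml_asymptotic_variance(X_abs2, rho_abs, rho_ang, H1_abs, H2_abs, theta):
    """Asymptotic variance of the ML log-magnitude (or phase) estimate, per (46).

    X_abs2  : list of |X_i|^2
    rho_abs : list of |rho_i| (each < 1)
    rho_ang : list of theta_i, the angles of the noise correlations
    theta   : true angle of H1/H2
    """
    R = H1_abs / H2_abs
    g = H1_abs * H2_abs
    D = [R + 1.0 / R - 2.0 * r * math.cos(theta - a)
         for r, a in zip(rho_abs, rho_ang)]
    S = sum(x2 * g / d for x2, d in zip(X_abs2, D))
    T = sum((1.0 - r**2) / d**2 for r, d in zip(rho_abs, D))
    return 1.0 / (2.0 * S) + T / (2.0 * S**2)
```

For $N = 2$, $R = 1$, $|H_1| = 1$, $|X_1|^2 = 0$, $|X_2|^2 = 4$, $|\rho_1| = 0.5$ aligned with $\theta$, and $\rho_2 = 0$, the function gives $0.375$, in agreement with evaluating (46) by hand for these values.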

Thus, the errors in the ML estimate are asymptotically uncorrelated, and since
the estimate should have gaussian statistics asymptotically (we beg this question), the
errors are asymptotically independent. An asymptotically circular confidence region,
centered on the estimate, can therefore be drawn in the plane whose rectangular
coordinates are $(\delta, \theta)$, where the radius is determined by the desired confidence and
is proportional to the square root of (46). [The asymptotic independence of the ML
log-magnitude and phase estimates, and their common asymptotic variance, imply that
the real and imaginary parts of $\hat{H}_1/\hat{H}_2$ have asymptotic statistical properties like those
of $(\hat{\delta}, \hat{\theta})$.]

We close this study with a comparison of (46) against (38) and (39), to
see how the asymptotic performance of the ML estimate relates to that of the ad hoc
estimate (9) and (10). The first term of (46) may be shown through the Schwarz
inequality to never exceed either of the (common) first terms of (38) and (39),
while if the $\{|\rho_i| \cos(\theta - \theta_i)\}$ happen to be the same for all $i$, the second term of
(46) is by inspection less than or equal to either of the second terms of (38) and
(39). (Note: $R^2 + R^{-2} \ge 2$ for all positive $R$; also, when all the $\{\rho_i\}$ vanish, (39)
equals (46), as it should since (10) then happens to be the ML estimator.) Thus,
under these circumstances the ML estimate is uniformly better* than the ad hoc
estimate, at least asymptotically.

If, however, the $\{|\rho_i| \cos(\theta - \theta_i)\}$ (which are the projections of the complex
noise correlations on the $H_1/H_2$ vector) are permitted to depend on $i$ (this is the actual
seismic situation), the ML estimate can be poorer than the ad hoc, and to an unlimited
degree (while both nonetheless remain in their asymptotic regions). To illustrate, let
us suppose that $N = 2$, $R = |H_1/H_2| = 1$, $\theta_1 = \theta_2 = \theta$, $X_1 = 0 = \rho_2$. Then for the variance
of the ML estimate of either the log-magnitude or the phase, we have from (46)

$$\overline{(\hat{\delta} - \delta)^2} = |X_2 H_1|^{-2} \left[ 1 + |X_2 H_1|^{-2} / (1 - |\rho_1|) \right]$$

while for the ad hoc estimates we have from (38) and (39)

$$\sigma_R^2 / R^2 = |X_2 H_1|^{-2} \left[ 1 + |X_2 H_1|^{-2} \left( 1 - \tfrac{1}{2}|\rho_1|^2 \right) \right], \qquad \sigma_\theta^2 = |X_2 H_1|^{-2} \left[ 1 + |X_2 H_1|^{-2} \left( 1 + |\rho_1| - \tfrac{1}{2}|\rho_1|^2 \right) \right]$$

As $|\rho_1| \to 1$, $|X_2 H_1|$ must clearly be increased much more for ML estimation than for
ad hoc estimation, to obtain comparable performance (the value of $|X_2 H_1|$ being great
enough that the variances are asymptotically small in either case). Of course, were
we somehow to know that $X_1 = 0$ in performing the estimation, we would naturally
ignore the first pair of observations, then obtaining $|X_2 H_1|^{-2} \left[ 1 + |X_2 H_1|^{-2} / 2 \right]$ as the
common variance for the log-magnitude and phase estimation by both the ML and
ad hoc methods. This, however, would be clairvoyance.

* It cannot be hoped that the ML estimate is uniformly better than all other estimates,
even asymptotically, because of the presence of the unknown "incidental" parameters
$\{X_i\}$; certainly, other estimators exist that will, by chance, be better "tuned" to a
particular set of $\{X_i\}$ than the ML estimator. For example, the ML procedure uses
all observations, irrespective of whether they actually contain probing energy or not;
other estimators may fortuitously reject such observations. Perhaps the ML
estimator is best in some minimax sense, but this remains to be shown. For studies of the
effects of incidental parameters, see the papers by Neyman and Scott, and by Kiefer
and Wolfowitz, referenced by K/S.
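
The two-observation example can be tabulated to see how quickly the ML variance degrades as $|\rho_1| \to 1$. The sketch below is a hedged illustration: it assumes the three closed forms of the example are $s[1 + s/(1 - |\rho_1|)]$ for ML and $s[1 + s(1 - |\rho_1|^2/2)]$, $s[1 + s(1 + |\rho_1| - |\rho_1|^2/2)]$ for the ad hoc magnitude and phase, with $s = |X_2 H_1|^{-2}$. The function name and the parameter `snr` (standing for $|X_2 H_1|^2$) are our own.

```python
def two_observation_example(snr, rho1_abs):
    """Variances (ML, ad hoc magnitude, ad hoc phase) for the N = 2 example.

    snr      : |X_2 H_1|^2, assumed large so the asymptotic forms apply
    rho1_abs : |rho_1|, with 0 <= rho1_abs < 1
    """
    s = 1.0 / snr
    ml = s * (1.0 + s / (1.0 - rho1_abs))
    adhoc_R = s * (1.0 + s * (1.0 - rho1_abs**2 / 2.0))
    adhoc_theta = s * (1.0 + s * (1.0 + rho1_abs - rho1_abs**2 / 2.0))
    return ml, adhoc_R, adhoc_theta

# Illustrative sweep: the ML variance grows without bound as |rho_1| -> 1,
# while the ad hoc variances stay bounded.
for r in (0.0, 0.5, 0.9, 0.99):
    print(r, two_observation_example(100.0, r))
```

At $|\rho_1| = 0$ the ML value equals the ad hoc phase value and lies below the ad hoc magnitude value only when $R \ne 1$; in this example ($R = 1$) all three coincide at $s(1 + s)$.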